Abstract. The control of large queueing networks is a notoriously difficult problem. Recently, an interesting new
policy design framework for the control problem called h-MaxWeight has been proposed: h-MaxWeight is a natural
generalization of the famous MaxWeight policy where, instead of the quadratic, any other surrogate value function can be
applied. Stability of the policy is then achieved through a perturbation technique. However, stability crucially depends
on the parameter choice, which has to be adapted in simulations. In this paper we use a different technique where the
required perturbations can be directly implemented in the weight domain, which we then call a scheduling field.
Specifically, we derive the theoretical arsenal that guarantees universal stability while still operating 'close' to the
underlying cost criterion. Simulation examples suggest that the new approach to policy synthesis can even provide
significantly higher gains irrespective of any further assumptions on the network model or parameter choice.
1. Introduction. Control policy synthesis for stochastic queueing networks has a multitude of
practical applications ranging from the Internet and other data networks over transport networks,
manufacturing networks, and power distribution networks in industry to mobile ad-hoc networks.
Typically, there are two underlying design criteria: 1) throughput optimality (i.e. the policy supports
every set of arrival rates which can potentially be supported by any other algorithm) and 2) optimality
with respect to some predefined cost criterion. The literature contains a vast number of control
policies for queueing networks. One famous approach is the MaxWeight policy, originally introduced
by Tassiulas and Ephremides in [5], which is known to be throughput optimal. However, it is seldom
used in practice in its pure form since it potentially leads to large delays. Even worse, under low load
MaxWeight can behave entirely irrationally, looping single packets for a long period of time. An example
network (originally presented in [8]) where MaxWeight shows such behavior is given in Fig. 1.1. Here,
a control policy that is designed to minimize delay is expected not to route packets over the small loop.
A lot of work has been carried out on delay reduction in backpressure based policies
(e.g. [?], [?], [10]); a general class of throughput optimal policies with improved delay performance was
recently presented in [7]. A survey on policy synthesis techniques can be found for example in [?].

Recently, an interesting new framework for policy design called h-MaxWeight has been proposed,
which combines the MaxWeight philosophy with a cost criterion. The h-MaxWeight policy can
be seen as a myopic policy where, instead of the quadratic, any other cost function can be applied.
Stability of the policy is then achieved by perturbing the arguments of the cost function appropriately
while still being reasonably 'close' to the original cost function. However, as already mentioned in
[5], stability and also cost performance crucially depend on the parameter choice, which then has to be
adapted in simulations. The latter point motivates the different perturbation technique proposed in this
paper. It rests upon the observation that, apart from technical stability arguments, there is actually
no specific reason to apply the perturbation technique solely in the argument of the cost function.
Instead, when directly applied to the weights, the approach becomes much more flexible and, in sharp
contrast to [5], can guarantee universal stability while still maintaining asymptotic cost optimality
with, in addition, even better cost performance in the non-asymptotic regime. The analysis of universal
stability and high traffic asymptotic cost optimality becomes involved, though, since there is in general no
longer a natural Lyapunov candidate from the cost function; the theoretical arsenal to circumvent this
technical challenge is established in this paper.

Notation. We use boldface letters to denote vectors and matrices, and common letters with subscript
for their elements, such that $A_i$ is the $i$-th element of vector $\mathbf{A}$ and $B_{ij}$ is the element in row $i$
and column $j$ of matrix $\mathbf{B}$. Moreover, $\mathbf{A}^T$ refers to the transpose of $\mathbf{A}$. $\mathrm{E}\{X\}$ denotes the expected
value of random variable $X$. Let $\mathbf{I}$ denote the identity matrix of appropriate dimension. Furthermore,
we denote by $\mathbf{1}$ the vector of all ones. $\mathrm{diag}(a_1, a_2, \ldots)$ refers to a diagonal matrix built from the elements
$a_1, a_2, \ldots$, $\|\cdot\|_1$ denotes the $\ell_1$ vector norm, and $\|\mathbf{x}\|$ is an arbitrary norm. Furthermore, we use $\mathcal{A}^c$
to denote the complement of a set $\mathcal{A}$. The probability of $\mathcal{A}$ is denoted as $\Pr\{\mathcal{A}\}$. The indicator $\mathbb{I}\{\cdot\}$
equals 1 if the argument is true and equals 0 otherwise.



Fig. 1.1. Exemplary network where MaxWeight shows poor delay performance.
2. System Model. Similar to [5] we use a simple stochastic network model: the Controlled
Random Walk (CRW) model. We consider a queueing network with $m$ queues in total representing
$m$ physical buffers with unlimited storage capacity. We arrange the queue backlog in the vector $\mathbf{Q}$,
such that $\mathbf{Q} = [Q_1, \ldots, Q_m]^T$, which we refer to as the queue state. Let $\mathcal{M}$ be the set of queue indices.
Suppose that the evolution of the queueing system is time slotted with $t \in \mathbb{N}_0$. Then, the CRW model
is defined by the queueing law:

\[ \mathbf{Q}(t+1) = \left[\mathbf{Q}(t) + \mathbf{B}(t+1)\,\mathbf{U}(t)\right]^+ + \mathbf{A}(t+1). \tag{2.1} \]

Here, the vector process $\mathbf{A}(t) \in \mathbb{N}^m$ is the (exogenous) influx to the queueing system with mean
$\boldsymbol{\alpha} \in \mathbb{R}_+^m$ (vector of arrival rates in packets per slot); $\mathbf{B}(t) \in \mathbb{Z}^{m \times l}$ is a matrix process with average
$\mathbf{B} \in \mathbb{R}^{m \times l}$, containing both information about network topology (that is, connectivity or routing
paths) and service rate.$^1$ The control $\mathbf{u} = \mathbf{U}(t)$ in slot $t$ is an element of the set $\{0,1\}^l$ constrained
by $\mathbf{C}\mathbf{u} \le \mathbf{1}$ using the binary constituency matrix $\mathbf{C} \in \mathbb{Z}_+^{l_m \times l}$ (with $l_m > 0$ being the number of
resource constraints in the network). For the sake of notational simplicity we omit the time index in
the following where possible. Throughout the entire paper $\mathbf{x} \in \mathbb{N}^m$ denotes the actual backlog.

In what follows, the queueing system (2.1) is assumed to be a $\delta_0$-irreducible Markov chain ($\delta_0$
being the point measure at $\mathbf{x} = \mathbf{0}$).
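
To make the queueing law (2.1) concrete, the following minimal Python sketch performs one slot of the CRW recursion. The function name `crw_step` and the two-queue tandem matrix are illustrative assumptions and are not taken from the model above; randomness is left out, so the matrix and arrival vector passed in play the role of one realization of $\mathbf{B}(t+1)$ and $\mathbf{A}(t+1)$.

```python
def crw_step(q, u, b, a):
    """One slot of Q(t+1) = [Q(t) + B(t+1) U(t)]^+ + A(t+1).

    q: current backlog (length m), u: binary control (length l),
    b: m x l matrix realization, a: arrival realization (length m).
    """
    nxt = []
    for i in range(len(q)):
        moved = sum(b[i][j] * u[j] for j in range(len(u)))
        # Truncate at zero before adding exogenous arrivals.
        nxt.append(max(q[i] + moved, 0) + a[i])
    return nxt


# Hypothetical tandem of two queues: control 0 moves a packet from
# queue 0 to queue 1, control 1 removes a packet from queue 1.
B_TANDEM = [[-1, 0],
            [1, -1]]

next_state = crw_step([2, 0], [1, 0], B_TANDEM, [1, 0])
```

Serving an empty queue is harmless in this model because of the truncation $[\cdot]^+$, which is the main difference to the untruncated law used later for h-MaxWeight.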

2.1. Example. Consider the introductory example of Fig. 1.1. We have traffic arriving at queue
$Q_1$ and leaving the network after being processed at $Q_4$. Moreover, in each time slot traffic from queue
$Q_3$ can be routed either to $Q_4$ or to $Q_5$, not both, thus $u_3 + u_4 \le 1$. We assume arrivals with rate
$\alpha > 0$ and set all service rates equal to 1 (thus for the network to be stabilizable it is required that
$\alpha < 1$). The corresponding CRW model is given by:
Let us present some preliminary results next.
3. Preliminaries. 

3.1. Stability. The stability of a $\delta_0$-irreducible Markov chain can be defined in different manners.
We first recall the definition of a recurrent Markov chain as given in [5], based on the occupation time

\[ \eta_{\mathcal{A}} := \sum_{t=1}^{\infty} \mathbb{I}\{\mathbf{Q}(t) \in \mathcal{A}\}, \]

which gives the number of visits to a set $\mathcal{A} \subset \mathbb{R}_+^m$ by a Markov chain after time zero. A Markov chain
is recurrent if $\mathrm{E}\{\eta_{\mathcal{A}}\} = +\infty$ holds for any set $\mathcal{A} \subset \mathbb{R}_+^m$. Additionally, if the Markov chain admits
an invariant probability measure, then it is positive recurrent. If the $\delta_0$-irreducible Markov chain is
positive recurrent, it is also weakly stable [3], so that it holds

\[ \lim_{t \to +\infty} \Pr\left(\|\mathbf{Q}(t)\| > B(\epsilon)\right) < \epsilon \]

for any $\epsilon > 0$ and some constant $B(\epsilon) > 0$. In this paper we also apply the following stability definition.

Definition 3.1. A Markov chain is called $f$-stable if there is an unbounded function $f : \mathbb{R}_+^m \to \mathbb{R}_+$
such that for any $0 < B < +\infty$ the set $\mathcal{B} := \{\mathbf{x} : f(\mathbf{x}) \le B\}$ is compact, and furthermore it holds

\[ \limsup_{t \to +\infty} \mathrm{E}\{f(\mathbf{Q}(t))\} < +\infty. \tag{3.1} \]

In the definition the function $f$ is unbounded in all positive directions, so that $f(\mathbf{x}) \to \infty$ if
$\|\mathbf{x}\| \to \infty$. Choosing directly $f(\mathbf{x}) = \|\mathbf{x}\|$, Definition 3.1 is equivalent to the definition of strong stability
[3], which implies weak stability. Clearly, for any $f(\mathbf{x})$ which grows faster than $\|\mathbf{x}\|$, inequality (3.1)
implies that the Markov chain is strongly stable. We call a vector of arrival rates $\boldsymbol{\alpha} \in \mathbb{R}_+^m$ stabilizable
when the corresponding queueing system driven by some specific scheduling policy is positive recurrent.

$^1$ If not stated otherwise, service rates are usually assumed to be one throughout the paper.

A scheduling policy is now called throughput optimal if it keeps the Markov chain positive recurrent
for any vector of arrival rates $\boldsymbol{\alpha}$ for which a stabilizing policy exists. It is easy to show that for our
model, by introducing the velocity set

\[ \mathsf{V} := \{\mathbf{v} \in \mathbb{R}^m : \mathbf{v} = \mathbf{B}\mathbf{u} + \boldsymbol{\alpha}\}, \]

the system is stabilizable if and only if the interior of $\mathsf{V}$ contains $\mathbf{v} = \mathbf{0}$ [5].
3.2. Myopic Policies: h-MaxWeight. Let us introduce a cost function

\[ c : \mathbb{R}_+^m \to \mathbb{R}_+, \quad \mathbf{x} \mapsto c(\mathbf{x}), \]

assigning any queue state a non-negative number. Typically, the goal is to minimize the average cost
over a given finite or infinite time period or some discounted cost criterion. The optimal solution to
the resulting problems (which in discrete time can be modelled as a Markov Decision Problem) can
be found by dynamic programming, which is, however, infeasible for large networks.

A simple approach to queueing network control is the myopic or greedy policy. Such a policy
selects the control decision that minimizes the expected cost only for the next time slot. Many policies
can be considered myopic: for example, it is shown in [5] that taking $c(\mathbf{x}) = \mathbf{x}^T \mathbf{D} \mathbf{x}$, for some positive
diagonal matrix $\mathbf{D}$, the corresponding MaxWeight policy is throughput optimal. However, very little
is known about stability properties of other cost function families.

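In [3], a cost function based policy design framework called h-MaxWeight is introduced, which is
a generalization of the MaxWeight policy. Meyn considers a slightly different definition of the CRW
model, which is characterized by the queueing law

\[ \mathbf{Q}(t+1) = \mathbf{Q}(t) + \mathbf{B}(t+1)\,\mathbf{U}(t) + \mathbf{A}(t+1). \tag{3.2} \]

The control $\mathbf{U}(t) \in \mathbb{N}^l$ is an element of the region

\[ \mathsf{U}^{\diamond}(\mathbf{x}) := \mathsf{U}(\mathbf{x}) \cap \{0,1\}^l, \]

with

\[ \mathsf{U}(\mathbf{x}) := \{\mathbf{u} \in \mathbb{R}_+^l : \mathbf{C}\mathbf{u} \le \mathbf{1},\ [\mathbf{B}\mathbf{u} + \boldsymbol{\alpha}]_i \ge 0 \text{ for } x_i = 0\}. \]

In the h-MaxWeight based control policy, the control vector is derived according to

\[ \arg\min_{\mathbf{u} \in \mathsf{U}^{\diamond}(\mathbf{x})} \langle \nabla h(\mathbf{x}), \mathbf{B}\mathbf{u} + \boldsymbol{\alpha} \rangle. \tag{3.3} \]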
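
For small networks the minimization in (3.3) can be carried out by brute force. The sketch below, in plain Python, enumerates all binary controls that satisfy $\mathbf{C}\mathbf{u} \le \mathbf{1}$ and returns the one with the smallest inner product. The helper names, the tandem matrix and the quadratic example are assumptions made for illustration; for brevity the boundary constraint $[\mathbf{B}\mathbf{u} + \boldsymbol{\alpha}]_i \ge 0$ for $x_i = 0$ is not enforced here.

```python
from itertools import product


def feasible_controls(c, l):
    """All u in {0,1}^l with C u <= 1 (componentwise)."""
    return [u for u in product((0, 1), repeat=l)
            if all(sum(row[j] * u[j] for j in range(l)) <= 1 for row in c)]


def myopic_control(weights, b_mean, alpha, c):
    """argmin over feasible u of <weights, B u + alpha>.

    weights plays the role of the gradient of h at the current state.
    """
    m, l = len(b_mean), len(b_mean[0])

    def value(u):
        return sum(
            weights[i] * (sum(b_mean[i][j] * u[j] for j in range(l)) + alpha[i])
            for i in range(m)
        )

    return min(feasible_controls(c, l), key=value)


# Hypothetical tandem: control 0 serves queue 0 into queue 1,
# control 1 serves queue 1 out of the network.
B_MEAN = [[-1, 0], [1, -1]]
ALPHA = [0.5, 0.0]
SEPARATE_SERVERS = [[1, 0], [0, 1]]   # each control has its own resource
SHARED_SERVER = [[1, 1]]              # both controls compete for one resource

# With h(x) = x^T x / 2 the weights are simply the backlog x (MaxWeight).
u_separate = myopic_control([3, 1], B_MEAN, ALPHA, SEPARATE_SERVERS)
u_shared = myopic_control([3, 1], B_MEAN, ALPHA, SHARED_SERVER)
```

Enumeration scales exponentially in $l$ and is only meant to show what the policy computes in each slot.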
Thus, the policy is myopic with respect to the gradient of some perturbation $h$ of the underlying
cost function. Meyn develops two main constraints on the function $h$: the first requires the partial
derivative of $h$ to vanish when queues become empty:

\[ \frac{\partial h}{\partial x_i}(\mathbf{x}) = 0 \quad \text{if } x_i = 0. \tag{3.4} \]

Moreover, the dynamic programming inequality has to hold for the function $h$:

\[ \min_{\mathbf{u} \in \mathsf{U}(\mathbf{x})} \langle \nabla h(\mathbf{x}), \mathbf{B}\mathbf{u} + \boldsymbol{\alpha} \rangle \le -c(\mathbf{x}). \tag{3.5} \]

When $h$ is non-quadratic, the derivative condition (3.4) is not always fulfilled. Therefore a perturbation
technique is used where $h(\mathbf{x}) = h_0(\tilde{\mathbf{x}})$, hence it is a perturbation of a function $h_0$. Two perturbations
are proposed: an exponential perturbation with $\theta \ge 1$ given by

\[ \tilde{x}_i := x_i + \theta\left(e^{-x_i/\theta} - 1\right), \tag{3.6} \]

and a logarithmic perturbation with $\theta > 0$ defined as

\[ \tilde{x}_i := x_i \log\left(1 + \frac{x_i}{\theta}\right). \tag{3.7} \]
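
As a quick numerical illustration, the following Python sketch evaluates the two perturbations and their derivatives, assuming they read $\tilde{x}_i = x_i + \theta(e^{-x_i/\theta} - 1)$ and $\tilde{x}_i = x_i \log(1 + x_i/\theta)$. The function names are hypothetical. Both slopes vanish at $x_i = 0$, which is what makes the chain rule deliver the derivative condition (3.4) for $h(\mathbf{x}) = h_0(\tilde{\mathbf{x}})$.

```python
import math


def exp_perturbation(x, theta):
    """x~ = x + theta * (exp(-x / theta) - 1)."""
    return x + theta * (math.exp(-x / theta) - 1.0)


def exp_perturbation_slope(x, theta):
    """d x~ / d x = 1 - exp(-x / theta); zero at x = 0, tends to 1."""
    return 1.0 - math.exp(-x / theta)


def log_perturbation(x, theta):
    """x~ = x * log(1 + x / theta)."""
    return x * math.log(1.0 + x / theta)


def log_perturbation_slope(x, theta):
    """d x~ / d x = log(1 + x / theta) + x / (theta + x); zero at x = 0."""
    return math.log(1.0 + x / theta) + x / (theta + x)


def central_difference(fn, x, theta, step=1e-6):
    """Numerical derivative used to cross-check the closed-form slopes."""
    return (fn(x + step, theta) - fn(x - step, theta)) / (2.0 * step)
```

The exponential slope saturates at one, so the perturbed function behaves like $h_0$ for large backlogs, whereas the logarithmic slope keeps growing.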

While the first approach shows better performance in simulations, the stability of the resulting policy
depends on the parameter $\theta$ being sufficiently large (determined by the considered network setting).
This is overcome by the second perturbation, which is stabilizing for each feasible $\theta$; however, it comes
with the additional constraint

\[ \frac{\partial h_0}{\partial x_i}(\mathbf{x}) \ge \epsilon x_i, \quad \forall i \in \mathcal{M}, \tag{3.8} \]

which is a significant limitation on the space of functions that can be chosen as $h_0$. Regarding the
stability of the h-MaxWeight policy, Meyn devised the following theorem.

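Theorem 3.2 (Theorem 1.1 from [5]). Consider the model (3.2) satisfying the following conditions:

1. The i.i.d. process $(\mathbf{A}, \mathbf{B})$ has integer entries, and a finite second moment.

2. $B_{ij}(t) \ge -1$ for each $i, j$ and $t$, and for each $j \in \{1, \ldots, l_u\}$ there exists a unique value
$i_j \in \{1, \ldots, l\}$ satisfying

\[ B_{ij}(t) \ge 0 \quad \text{a.s.} \quad \forall i \ne i_j. \]

3. The function $h_0 : \mathbb{R}^m \to \mathbb{R}_+$ satisfies the following:

(a) Smoothness: The gradient $\nabla h_0$ is Lipschitz continuous,

(b) Monotonicity: $\nabla h_0(\mathbf{x}) \in \mathbb{R}_+^l$ for $\mathbf{x} \in \mathbb{R}_+^l$,

(c) The dynamic programming inequality (3.5) holds, with $c$ a norm on $\mathbb{R}^l$.

Then, there exist $\theta_0 < \infty$ and $\bar{\eta}_h < \infty$ such that for any $\theta \ge \theta_0$, the following bound holds under
the h-MaxWeight policy with perturbation (3.6):

\[ \mathrm{E}\{h(\mathbf{Q}(t+1)) - h(\mathbf{Q}(t)) \mid \mathbf{Q}(t) = \mathbf{x}\} \le -\tfrac{1}{2} c(\mathbf{x}) + \tfrac{1}{2}\bar{\eta}_h. \]

Consequently,

\[ n^{-1} \sum_{t=0}^{n-1} \mathrm{E}\{c(\mathbf{Q}(t))\} \le 2 n^{-1} h(\mathbf{x}) + \bar{\eta}_h, \quad n \ge 1,\ \mathbf{x} \in \mathbb{Z}_+^l. \]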

Already in [5], Meyn mentioned some of the drawbacks of this h-MaxWeight policy: it depends
crucially on the parameter choice and is therefore not throughput optimal (which actually motivated the
second perturbation (3.7)). Furthermore, the approach also depends on additional constraints on the
network topology (cf. Theorem 3.2, Condition 2)), so that it is not easily applicable to more general
networks. A different approach is described next: note at first that it is by no means necessary to
choose the perturbation as in (3.6) as long as we stay reasonably 'close' to $h_0$ while still maintaining
stability (the only argument is technical, because there is typically no longer a natural Lyapunov
candidate obtained from $h_0$). By contrast, here we directly start with the function $\boldsymbol{\mu}(\mathbf{x}) := \nabla h(\mathbf{x})$ and
derive conditions that guarantee throughput optimality, irrespective of the actual parameters
chosen.

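4. μ-MaxWeight Network Control.

4.1. Sufficient Stability Conditions. In this section, we give generalized sufficient conditions
for throughput optimality for the systems (2.1), (3.2). In what follows, we consider scheduling policies
of the form

\[ \mathbf{u}^*(\mathbf{x}) = \arg\min_{\mathbf{u} \in \mathbb{N}^l :\, \mathbf{C}\mathbf{u} \le \mathbf{1}} \langle \boldsymbol{\mu}(\mathbf{x}), \mathbf{B}\mathbf{u} + \boldsymbol{\alpha} \rangle, \tag{4.1} \]

where $\boldsymbol{\mu}(\mathbf{x})$ is a vector valued function $\mathbb{R}_+^m \to \mathbb{R}_+^m$, which is called the weight vector for some actual
queue state $\mathbf{x}$. Note that $\boldsymbol{\mu}$ is reminiscent of a vector field and can thus be interpreted as a scheduling
field, for which we present a stability characterization. Observe that by construction of the policy we
can normalize the weight vector as

\[ \bar{\boldsymbol{\mu}}(\mathbf{x}) = \frac{\boldsymbol{\mu}(\mathbf{x})}{\|\boldsymbol{\mu}(\mathbf{x})\|_1} \tag{4.2} \]

without loss of generality, and hence $\|\bar{\boldsymbol{\mu}}(\mathbf{x})\|_1 = 1$. Furthermore, we assume that the resulting policy
is non-idling, i.e. $\|\boldsymbol{\mu}(\mathbf{x})\|_1 = 0$ if and only if $\mathbf{x} = \mathbf{0}$.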



Theorem 4.1. Consider the queueing system (2.1) driven by the control policy (4.1) with some
scheduling field $\boldsymbol{\mu}$. The policy is throughput optimal if the corresponding normalized scheduling field
given in Eqn. (4.2) fulfills the following conditions:

1. Given any $0 < \epsilon_1 < 1$ and $C_1 > 0$, there is some $B_1 > 0$ so that for any $\Delta\mathbf{x} \in \mathbb{R}^m$ with
$\|\Delta\mathbf{x}\| < C_1$, we have $|\bar{\mu}_i(\mathbf{x} + \Delta\mathbf{x}) - \bar{\mu}_i(\mathbf{x})| < \epsilon_1$ for any $\mathbf{x} \in \mathbb{R}_+^m$ with $\|\mathbf{x}\| > B_1$, $\forall i \in \mathcal{M}$.

2. Given any $0 < \epsilon_2 < 1$ and $C_2 > 0$, there is some $B_2 > 0$ so that for any $\mathbf{x} \in \mathbb{R}_+^m$ with $\|\mathbf{x}\| > B_2$
and $x_i < C_2$, we have $\bar{\mu}_i(\mathbf{x}) < \epsilon_2$, for any $i \in \mathcal{M}$.

Moreover, for any stabilizable arrival process the queueing system is $f$-stable under the given policy,
where $f$ is an unbounded function as defined in Definition 3.1. The exact formulation of $f$ depends on
the field $\bar{\boldsymbol{\mu}}(\mathbf{x})$.

Proof. The proof is given in Appendix A. □

Remarkably, there is no a priori need for the dynamic programming inequality (it follows).
Theorem 4.1 can be further refined and tailored to the situation in Theorem 3.2.

Corollary 4.2. Consider the queueing system (3.2) driven by the control policy (3.3) with
some cost function $h$. Suppose the corresponding scheduling field $\boldsymbol{\mu}(\mathbf{x}) := \nabla h(\mathbf{x})$ is continuously
differentiable and Condition 2) in Theorem 3.2 on the network topology $\{\mathbf{B}(\cdot)\}$ holds. Then, the following
conditions are sufficient for throughput optimality:

1. For any $\epsilon > 0$ there is some $C_\epsilon > 0$ so that for all $\|\mathbf{x}\| > C_\epsilon$:

\[ \|\nabla \log(\mu_i(\mathbf{x}))\| < \epsilon, \quad \forall i \in \mathcal{M}. \]

2. If $x_i = 0$ then $\mu_i(\mathbf{x}) = 0$, $\forall i \in \mathcal{M}$.

Proof. The proof can be found in Appendix B. □

Remark 1. The restriction on the network topology in Corollary 4.2 can be omitted if Condition
2) is replaced with Condition 2) of Theorem 4.1.
Theorem 4.1 can also be tailored to policies with simple perturbations (by simple we mean each
component is perturbed by a single parameter).

Corollary 4.3. Suppose everything is as in Corollary 4.2. Let the scheduling field be defined as
$\boldsymbol{\mu}(\mathbf{x}) := \nabla h_0(\tilde{\mathbf{x}})$ for some given simple perturbation $\tilde{\mathbf{x}}$. Then, for some $\epsilon > 0$,

\[ \frac{\partial \tilde{x}_i}{\partial x_i} \text{ is Lipschitz, and } \frac{\partial \tilde{x}_i}{\partial x_i} \to \infty, \quad x_i \to \infty, \]

\[ \frac{\partial h_0}{\partial x_i} \text{ is Lipschitz, and } \frac{\partial h_0}{\partial x_i}(\mathbf{x}) \ge \left(\frac{\partial \tilde{x}_i}{\partial x_i}\right)^{1+\epsilon}, \quad x_i \to \infty, \]

is sufficient for stability.

Proof. The proof can be found in Appendix C. □

Remark 2. The conditions in Corollary 4.3 indeed cover a larger class of throughput optimal
policies compared e.g. to the perturbation in (3.7) together with condition (3.8), since it is only required
that $\frac{\partial h_0}{\partial x_i}$ grows as $\log^{1+\epsilon} x_i$ in each component (observe that
$\frac{\partial \tilde{x}_i}{\partial x_i} = \log\left(1 + \frac{x_i}{\theta}\right) + \frac{x_i}{\theta + x_i}$, which is also Lipschitz).

Remark 3. (On extensions) A weaker condition than Condition 1) in Corollary 4.2 is as follows:
for any $\epsilon > 0$ there is some $C_\epsilon^{*} > 0$ so that for all $\|\mathbf{x}\| > C_\epsilon^{*}$:

\[ \|\nabla \mu_i(\mathbf{x})\| < \epsilon \|\boldsymbol{\mu}(\mathbf{x})\|, \quad \forall i \in \mathcal{M}. \]

We conjecture that this will establish the most general condition. Furthermore, we showed in the
wireless broadcast setting that the conditions are also necessary if the boundary of the feasible (rate)
region contains differentiable parts, i.e. parts where the hyperplanes defined through the scheduling field
are uniquely supported [12].



5. Policy Design. Corollary 4.2 makes it apparent that one is not confined to the specific policy
design rule in Theorem 3.2 and can ensure throughput optimality much more easily by directly modifying
the scheduling field appropriately. Let us demonstrate this by a simple example:

We consider a simple network of two queues in tandem. Assume we have a linear cost function
$c(\mathbf{x}) = c_1 x_1 + c_2 x_2$. Similar to [5] we perturb the optimal value function from the fluid model $J^*$
(which is known in this setting); thus (assuming parameters as in [5]) we have

\[ h_0(\mathbf{x}) = J^*(\mathbf{x}) = \tfrac{1}{2} d_1 (x_1 + x_2)^2 + \tfrac{1}{2} d_2 x_2^2 \]

with constants $d_1$ and $d_2$ determined by the cost and rate parameters as in [5] ($d_2$ being proportional
to $c_2 - c_1$). The gradient $\nabla h$ is then given by:

\[ \nabla h(\mathbf{x}) = \begin{bmatrix} d_1(\tilde{x}_1 + \tilde{x}_2)\left(1 - e^{-x_1/\theta}\right) \\ \left(d_1(\tilde{x}_1 + \tilde{x}_2) + d_2 \tilde{x}_2\right)\left(1 - e^{-x_2/\theta}\right) \end{bmatrix}. \]

Thereby, the exponential perturbation (3.6) was used. The h-MaxWeight policy which we obtain
using this function in (3.3) does not fulfill the conditions of Corollary 4.2 for throughput optimality.
Therefore we can (intuitively) derive a different perturbation:

\[ \boldsymbol{\mu}(\mathbf{x}) = \begin{bmatrix} d_1(x_1 + x_2)\left(1 - e^{-\frac{x_1}{\theta(1 + x_2)}}\right) \\ \left(d_1(x_1 + x_2) + d_2 x_2\right)\left(1 - e^{-\frac{x_2}{\theta(1 + x_1)}}\right) \end{bmatrix}. \]

Setting

\[ \mathbf{P}_\theta(\mathbf{x}) := \mathrm{diag}\left(1 - \exp\left(-\frac{x_i}{\theta\left(1 + \sum_{j \ne i} x_j\right)}\right)\right), \tag{5.1} \]

the policy can be concisely written as $\boldsymbol{\mu}(\mathbf{x}) = \mathbf{P}_\theta(\mathbf{x}) \nabla h_0(\mathbf{x})$. It can be easily verified that the
conditions of Corollary 4.2 hold for suitable $h_0$, i.e. the policy is throughput optimal for any $\theta > 0$. Observe
also that we have incorporated the useful property that queues will not be served when other queues
tend to infinity.
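
The behaviour of the scaling matrix can be checked numerically. The Python sketch below assumes that (5.1) reads $\mathbf{P}_\theta(\mathbf{x}) = \mathrm{diag}\big(1 - \exp(-x_i / (\theta(1 + \sum_{j \ne i} x_j)))\big)$ and that the field is $\boldsymbol{\mu}(\mathbf{x}) = \mathbf{P}_\theta(\mathbf{x}) \nabla h_0(\mathbf{x})$, normalized in the $\ell_1$ norm. The function names and the linear-cost example are illustrative choices.

```python
import math


def p_theta_diag(x, theta):
    """Diagonal of P_theta(x): 1 - exp(-x_i / (theta * (1 + sum_{j != i} x_j)))."""
    total = sum(x)
    return [1.0 - math.exp(-xi / (theta * (1.0 + total - xi))) for xi in x]


def scheduling_field(x, grad_h0, theta):
    """Normalized field P_theta(x) * grad h0(x); returned unscaled if it is zero."""
    mu = [p * g for p, g in zip(p_theta_diag(x, theta), grad_h0(x))]
    norm = sum(abs(v) for v in mu)
    return [v / norm for v in mu] if norm > 0 else mu


def linear_cost_gradient(x):
    """Gradient of the hypothetical linear cost h0(x) = x_1 + x_2."""
    return [1.0, 1.0]


# An empty queue receives zero weight.
field_at_boundary = scheduling_field([0.0, 5.0], linear_cost_gradient, 1.0)

# A queue with bounded backlog receives vanishing weight when another queue grows.
field_far_out = scheduling_field([3.0, 1e6], linear_cost_gradient, 1.0)
```

The first evaluation mirrors Condition 2) of Corollary 4.2, the second mirrors Condition 2) of Theorem 4.1; a numerical check of this kind is of course no substitute for verifying the conditions analytically.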






5.1. Numerical Results. Subsequently, we numerically compare the policies obtained with the
μ-MaxWeight framework to MaxWeight and the generalized h-MaxWeight. For this we consider the
introductory example of Fig. 1.1, described in detail in the context of the CRW model in Section
2.1. As mentioned before, MaxWeight can show bad performance w.r.t. delay, especially under low
load. We want to evaluate whether we can improve that performance using the cost-function based
approach. Since the optimal value function from the fluid model is not readily available, we assume the
linear cost function $c(\mathbf{x}) = \mathbf{c}^T \mathbf{x}$ for simplicity; hence, the resulting weight vector is $\boldsymbol{\mu}(\mathbf{x}) = \mathbf{P}_1(\mathbf{x})\, \mathbf{c}$
with $\mathbf{P}_1(\mathbf{x})$ given in (5.1) with $\theta = 1$.



Fig. 5.1(a) and Fig. 5.1(b) depict the average cost for different arrival rates after a running time of
10000 time slots. Fig. 5.1(a) shows the case when all $c_i = 1$. It can be observed that our cost-function
based approach provides significant gains over MaxWeight at all arrival rates. As expected, the gain
increases with decreasing network load. Moreover, we observe a performance similar to that of the generalized
h-MaxWeight policies ($\theta$ chosen sufficiently large) even though throughput optimality is imposed.



Fig. 5.1. Numerical comparison of the average cost of different policies (MaxWeight; h-MaxWeight with
exponential perturbation; h-MaxWeight with logarithmic perturbation; μ-MaxWeight). (a) Equal weights.
(b) Unequal weights.



Since our approach is based on a cost metric, it is natural to ask how it behaves in case the queues
are not weighted equally. For example, assume we want to discourage the use of the reverse loop; thus
we set $c_5 > c_i$ for all $i \ne 5$. Fig. 5.1(b) compares the control policies assuming $c_5 = 5$. In this
case, when the load increases our approach even outperforms h-MaxWeight with both exponential and
logarithmic perturbation, while at the same time providing throughput optimality.

6. Conclusion. We introduced a control policy synthesis framework for queueing networks that
combines throughput optimality by design with optimization with respect to an arbitrary cost metric.
To design such a policy we derived fundamental theoretical conditions that guarantee universal stability
and can easily be checked. We have shown that we can achieve higher performance gains both
over classical MaxWeight routing and over generalized MaxWeight algorithms, without their
inherent limitations such as parameter dependent stability or additional constraints on the network
model.

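Appendix A. Proof of Theorem 4.1.

Stability can be proven by establishing the so-called Lyapunov drift criteria as given in [9], [3].
That is to say, if we can find some non-negative $V(\mathbf{x}) : \mathbb{R}_+^m \to \mathbb{R}_+$, some $\delta > 0$ and a compact region
$\mathcal{B} := \{\mathbf{x} : \|\mathbf{x}\| \le B\}$ such that

\[ \mathrm{E}\{V(\mathbf{Q}(t+1)) \mid \mathbf{Q}(t)\} < +\infty \quad \forall \mathbf{Q}(t) \in \mathcal{B}, \tag{A.1} \]

\[ \Delta V(\mathbf{Q}(t)) \le -\delta \quad \forall \mathbf{Q}(t) \in \mathcal{B}^c, \tag{A.2} \]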
the queueing system is positive recurrent. Here, $\Delta V(\mathbf{Q}(t))$ is the one step drift defined as

\[ \Delta V(\mathbf{Q}(t)) := \mathrm{E}\{V(\mathbf{Q}(t+1)) - V(\mathbf{Q}(t)) \mid \mathbf{Q}(t)\}. \]

Furthermore, if for some $\theta > 0$ it satisfies

\[ \Delta V(\mathbf{x}) \le -\theta f(\mathbf{x}), \quad \forall \mathbf{Q}(t) \in \mathcal{B}^c, \tag{A.3} \]

for some $B > 0$ and unbounded positive function $f(\mathbf{x})$, it can be shown that the queueing system is
$f$-stable.
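
For intuition, the drift criterion can be evaluated exactly in a toy case. The Python sketch below assumes a single queue that serves one packet per slot (if present) and receives one packet with probability `alpha`; this model, the quadratic Lyapunov function and the function name are illustrative assumptions. For $V(x) = x^2$ and $x \ge 1$ the drift equals $-(1-\alpha)(2x - 1)$, so a bound of the form (A.3) holds with $f(x) = x$ and $\theta = 1 - \alpha$ outside the compact set $\{0\}$.

```python
def one_step_drift(x, alpha, lyapunov):
    """Exact E{V(Q(t+1)) - V(Q(t)) | Q(t) = x} for the assumed single-queue model.

    One packet is served per slot if the queue is non-empty, and one packet
    arrives with probability alpha.
    """
    after_service = max(x - 1, 0)
    expected_next = ((1.0 - alpha) * lyapunov(after_service)
                     + alpha * lyapunov(after_service + 1))
    return expected_next - lyapunov(x)


def quadratic(x):
    """Candidate Lyapunov function V(x) = x^2."""
    return x * x


drift_at_five = one_step_drift(5, 0.5, quadratic)
```

At the empty state the drift is positive, which is permitted because (A.2) and (A.3) are only required outside the compact region.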

We carry out the proof in two steps. First, we prove throughput optimality for those policies
whose corresponding fields $\bar{\boldsymbol{\mu}}(\mathbf{x})$ fulfill the integrability condition in Eqn. (A.4) below. The fields in
those policies can be regarded as the normalized gradient of a certain potential field $V(\mathbf{x})$. We show
that the expected drift $\Delta V(\mathbf{x})$ satisfies the inequality (A.3) and hence the system driven by those
policies is stable. In the second step, we extend the results to all other policies whose corresponding
fields are not integrable. It is shown that if the policies fulfill the conditions given in the theorem, their
fields $\bar{\boldsymbol{\mu}}(\mathbf{x})$ can be approximated by some $\hat{\boldsymbol{\mu}}(\mathbf{x})$ which is integrable. Then we prove the drift condition
on $\Delta V(\mathbf{x})$ for those policies and establish stability.

First, we analyze the subclass of fields whose components $\bar{\mu}_i(\mathbf{x})$ are continuously differentiable. Furthermore,
we assume that the field satisfies the integrability condition, i.e.,

\[ \frac{\partial \bar{\mu}_i(\mathbf{x})}{\partial x_j} = \frac{\partial \bar{\mu}_j(\mathbf{x})}{\partial x_i}, \quad \forall i, j \in \mathcal{M}. \tag{A.4} \]

For such scheduling policies, we have the following lemma.

Lemma A.1. If Eqn. (A.4) holds for all $\mathbf{x} \in \mathbb{R}_+^m$, then any stabilizable vector of arrival rates $\boldsymbol{\alpha}$ is
also stabilizable under the corresponding scheduling policy as long as $\bar{\boldsymbol{\mu}}(\mathbf{x})$ fulfills the conditions given
in Theorem 4.1.

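Proof. Condition (A.4) implies that the vector field defined by $\bar{\boldsymbol{\mu}}(\mathbf{x})$ has the path independence
property, namely the integral of $\bar{\boldsymbol{\mu}}(\mathbf{x})$ along a path depends only on the start and end points of
that path, not the particular route taken. According to the Poincaré lemma, the vector field $\bar{\boldsymbol{\mu}}(\mathbf{x})$ is
completely integrable and it is the gradient of a scalar field, that is to say, there exists some function
$f(\mathbf{x}) : \mathbb{R}_+^m \to \mathbb{R}_+$ with

\[ \frac{\partial f(\mathbf{x})}{\partial x_i} = \bar{\mu}_i(\mathbf{x}). \tag{A.5} \]

Setting the value of $f(\mathbf{x})$ at the origin equal to zero, $f(\mathbf{x})$ at the point $\mathbf{x}$ can be calculated by

\[ f(\mathbf{x}) = \int_0^{\|\mathbf{x}\|_2} \bar{\boldsymbol{\mu}}(t\bar{\mathbf{x}})^T \bar{\mathbf{x}}\, dt, \tag{A.6} \]

where $\bar{\mathbf{x}} := \mathbf{x}/\|\mathbf{x}\|_2$ is the normalized vector of $\mathbf{x}$. Since each element of $\bar{\boldsymbol{\mu}}(\mathbf{x})$ is larger than or equal to
zero, $f(\mathbf{x})$ is a positive function. Moreover, if $\|\mathbf{x}\|$ becomes large, according to Condition 2) in
Theorem 4.1, for the $i$-th queue with bounded $x_i$, $\bar{x}_i \to 0$ results in $\bar{\mu}_i(\mathbf{x}) \to 0$. Then for other queues
with $\bar{\mu}_j(\mathbf{x}) > C_\mu$, $x_j$ grows with increasing $\|\mathbf{x}\|$ and we have $\bar{x}_j > C_x$ for some $C_x$ and $C_\mu > 0$. Thus
it holds

\[ \bar{\boldsymbol{\mu}}(\mathbf{x})^T \bar{\mathbf{x}} > C \]

for some $C > 0$ if $\|\mathbf{x}\|$ is sufficiently large. Considering Eqn. (A.6), it follows that $f(\mathbf{x}) \to +\infty$ as
$\|\mathbf{x}\| \to +\infty$. Therefore, $f(\mathbf{x})$ is a positive, unbounded function as we used in Definition 3.1.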
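
The radial integral that recovers the potential from an integrable field can be reproduced numerically. The Python sketch below assumes the construction $f(\mathbf{x}) = \int_0^{\|\mathbf{x}\|_2} \boldsymbol{\mu}(t\bar{\mathbf{x}})^T \bar{\mathbf{x}}\, dt$ with $f(\mathbf{0}) = 0$ and uses the midpoint rule. The example field is a hypothetical gradient field (it is not normalized and is not a field from this paper); it is chosen only because its potential $g(\mathbf{x}) = x_1^2 + x_1 x_2$ is known in closed form, so the result can be checked.

```python
import math


def potential(field, x, steps=2000):
    """Midpoint-rule approximation of the radial line integral of `field` from 0 to x."""
    radius = math.sqrt(sum(v * v for v in x))
    if radius == 0.0:
        return 0.0
    xbar = [v / radius for v in x]          # unit vector pointing towards x
    dt = radius / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                  # midpoint of the k-th subinterval
        point = [t * v for v in xbar]
        total += sum(fi * vi for fi, vi in zip(field(point), xbar)) * dt
    return total


def example_field(x):
    """Gradient of g(x) = x_1^2 + x_1 * x_2; its cross-derivatives are both 1."""
    return [2.0 * x[0] + x[1], x[0]]


value_at_3_4 = potential(example_field, [3.0, 4.0])
```

Because the cross-derivatives of the example field coincide, the integrability condition holds and any other path from the origin to $\mathbf{x}$ would give the same value; for a non-integrable field the result would depend on the path, which is the difficulty addressed in the second step of the proof.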
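Observing a new vector field defined by $\boldsymbol{\nu}(\mathbf{x}) = f(\mathbf{x})\bar{\boldsymbol{\mu}}(\mathbf{x})$, we have

\[ \frac{\partial \nu_i(\mathbf{x})}{\partial x_j} = \frac{\partial (f(\mathbf{x})\bar{\mu}_i(\mathbf{x}))}{\partial x_j} = \bar{\mu}_j(\mathbf{x})\bar{\mu}_i(\mathbf{x}) + f(\mathbf{x})\frac{\partial \bar{\mu}_i(\mathbf{x})}{\partial x_j} = \bar{\mu}_i(\mathbf{x})\bar{\mu}_j(\mathbf{x}) + f(\mathbf{x})\frac{\partial \bar{\mu}_j(\mathbf{x})}{\partial x_i} = \frac{\partial \nu_j(\mathbf{x})}{\partial x_i}, \quad \forall i, j \in \mathcal{M}. \tag{A.7} \]

Condition (A.7) ensures that $\boldsymbol{\nu}(\mathbf{x})$ is also the gradient of a scalar field and there is a function $V(\mathbf{x}) :
\mathbb{R}_+^m \to \mathbb{R}_+$ with

\[ \frac{\partial V(\mathbf{x})}{\partial x_i} = f(\mathbf{x})\bar{\mu}_i(\mathbf{x}), \]

where $f(\mathbf{x})$ is the magnitude of the gradient and $\bar{\boldsymbol{\mu}}(\mathbf{x})$ is the direction of the gradient. Set $V(\mathbf{0}) = 0$;
then $V(\mathbf{x})$ at the point $\mathbf{x}$ is

\[ V(\mathbf{x}) = \int_0^{\|\mathbf{x}\|_2} f(t\bar{\mathbf{x}})\, \bar{\boldsymbol{\mu}}(t\bar{\mathbf{x}})^T \bar{\mathbf{x}}\, dt. \]

It is easy to show that the function $V(\mathbf{x})$ is also a positive, unbounded function. We use the function
$V(\mathbf{x})$ as our Lyapunov function in the proof.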

Subsequently, let $\mathbf{r} \in \mathbb{Z}^m$ be the vector of network induced arrivals plus departures and $\mathbf{a} \in \mathbb{N}^m$
be the vector of exogenous arrivals. Moreover, let $\mathbf{z} \in \mathbb{N}^m$ be the vector of the number of excess packets
compensating for the case when more packets are attempted to be removed than the queue contains.
The first condition on the Lyapunov function given in (A.1) is satisfied as long as $a_i$, $r_i$, $\forall i \in \mathcal{M}$, are
bounded. Next we analyze the second condition, namely the drift of $V(\mathbf{x})$ of the queueing system.

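Using the mean value theorem of differential calculus we have, for some $\tilde{\mathbf{x}}$ between $\mathbf{x}$ and $\mathbf{x} + \Delta\mathbf{x}$,
i.e. $\tilde{x}_i = \kappa_i x_i + (1 - \kappa_i)(x_i + \Delta x_i)$, $\forall i \in \mathcal{M}$, for some $\kappa_i \in [0, 1]$, and with $\Delta x_i = a_i - r_i + z_i$:

\[ \Delta V(\mathbf{x}) = \mathrm{E}\left\{\sum_{i=1}^{m} f(\tilde{\mathbf{x}})\bar{\mu}_i(\tilde{\mathbf{x}})(a_i - r_i) \,\Big|\, \mathbf{x}\right\} + \mathrm{E}\left\{\sum_{i=1}^{m} f(\tilde{\mathbf{x}})\bar{\mu}_i(\tilde{\mathbf{x}}) z_i \,\Big|\, \mathbf{x}\right\}. \tag{A.8} \]

Considering the first part in (A.8), we have

\[ \mathrm{E}\left\{\sum_{i=1}^{m} f(\tilde{\mathbf{x}})\bar{\mu}_i(\tilde{\mathbf{x}})(a_i - r_i) \,\Big|\, \mathbf{x}\right\} \tag{A.9} \]

\[ \le f(\mathbf{x})\left(\sum_{i=1}^{m} \bar{\mu}_i(\mathbf{x})\alpha_i - \sum_{i=1}^{m} \bar{\mu}_i(\mathbf{x})\mathrm{E}\{r_i \mid \mathbf{x}\}\right) + \mathrm{E}\left\{\sum_{i=1}^{m} \left|f(\tilde{\mathbf{x}})\bar{\mu}_i(\tilde{\mathbf{x}}) - f(\mathbf{x})\bar{\mu}_i(\mathbf{x})\right| |a_i - r_i| \,\Big|\, \mathbf{x}\right\}. \tag{A.10} \]

Since

\[ \mathrm{E}\{\mathbf{r} \mid \mathbf{x}\} = \mathbf{B}\mathbf{u}^*(\mathbf{x}) = \mathbf{B} \arg\min_{\mathbf{u}} \langle \bar{\boldsymbol{\mu}}(\mathbf{x}), \mathbf{B}\mathbf{u} \rangle, \]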
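for any stabilizable $\boldsymbol{\alpha}$ we can always find some $\Gamma > 0$, so that

\[ \mathrm{E}\left\{\sum_{i=1}^{m} \bar{\mu}_i(\mathbf{x})(\alpha_i - r_i) \,\Big|\, \mathbf{x}\right\} \le -\Gamma. \]

Hence the first part in (A.10) satisfies

\[ f(\mathbf{x})\left(\sum_{i=1}^{m} \bar{\mu}_i(\mathbf{x})\alpha_i - \sum_{i=1}^{m} \bar{\mu}_i(\mathbf{x})\mathrm{E}\{r_i \mid \mathbf{x}\}\right) \le -\Gamma f(\mathbf{x}). \]

For the second part in (A.10), we define $\Delta\mathbf{x} = \tilde{\mathbf{x}} - \mathbf{x}$. Then

\[ f(\mathbf{x} + \Delta\mathbf{x}) - f(\mathbf{x}) = \int_0^1 \bar{\boldsymbol{\mu}}(\mathbf{x} + t\Delta\mathbf{x})^T \Delta\mathbf{x}\, dt \le \int_0^1 \|\Delta\mathbf{x}\|_1\, dt = \|\Delta\mathbf{x}\|_1. \]

Since $a_i$ and $r_i$ are bounded, we choose some $C_3 > 1$ so that $a_i < C_3$ and $r_i < C_3$ for all $i$. Then
$\|\Delta\mathbf{x}\|_1$ is bounded by $2mC_3$ and we have

\[ |f(\tilde{\mathbf{x}}) - f(\mathbf{x})| \le \epsilon_3 f(\mathbf{x}) \]

for any given $\epsilon_3 > 0$ and sufficiently large $\|\mathbf{x}\|$. According to Condition 1) in Theorem 4.1 we also
have

\[ |\bar{\mu}_i(\tilde{\mathbf{x}}) - \bar{\mu}_i(\mathbf{x})| \le \epsilon_1. \]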

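Then if $\|\mathbf{x}\|$ is sufficiently large,

\[ \mathrm{E}\left\{\sum_{i=1}^{m} \left|f(\tilde{\mathbf{x}})\bar{\mu}_i(\tilde{\mathbf{x}}) - f(\mathbf{x})\bar{\mu}_i(\mathbf{x})\right| |a_i - r_i| \,\Big|\, \mathbf{x}\right\}
\le 2C_3\, \mathrm{E}\left\{\sum_{i=1}^{m} \left(f(\mathbf{x}) + \epsilon_3 f(\mathbf{x})\right)\left(\bar{\mu}_i(\mathbf{x}) + \epsilon_1\right) \,\Big|\, \mathbf{x}\right\} - 2C_3\, \mathrm{E}\left\{\sum_{i=1}^{m} f(\mathbf{x})\bar{\mu}_i(\mathbf{x}) \,\Big|\, \mathbf{x}\right\}
= \left(2mC_3\epsilon_1 + 2C_3\epsilon_3 + 2mC_3\epsilon_1\epsilon_3\right) f(\mathbf{x}) \tag{A.11} \]

holds for any $\epsilon_1, \epsilon_3 > 0$. Denoting the coefficient of $f(\mathbf{x})$ on the right-hand side by $\sigma_1$, we hence have
$\sigma_1 \to 0$ when $\|\mathbf{x}\| \to +\infty$.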
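Now we consider the second part in (A.8):

\[ \mathrm{E}\left\{\sum_{i=1}^{m} f(\tilde{\mathbf{x}})\bar{\mu}_i(\tilde{\mathbf{x}}) z_i \,\Big|\, \mathbf{x}\right\} \le \mathrm{E}\left\{\sum_{i=1}^{m} f(\mathbf{x})\bar{\mu}_i(\mathbf{x}) z_i \,\Big|\, \mathbf{x}\right\} + \mathrm{E}\left\{\sum_{i=1}^{m} \left|f(\tilde{\mathbf{x}})\bar{\mu}_i(\tilde{\mathbf{x}}) - f(\mathbf{x})\bar{\mu}_i(\mathbf{x})\right| z_i \,\Big|\, \mathbf{x}\right\}. \tag{A.12} \]

For the first part in (A.12), since $z_i \le r_i$, we have for some $C_4 > 0$

\[ \mathrm{E}\{z_i(t)\} \le C_4. \tag{A.13} \]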
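We define the set $\mathcal{G} := \{i : z_i > 0, i \in \mathcal{M}\}$. Since $r_i$ is bounded by $C_3$, we have $x_i < C_3$, $\forall i \in \mathcal{G}$. If
$\|\mathbf{x}\|$ is sufficiently large so that $\|\mathbf{x}\| > mC_3$, we can exclude the case $\mathcal{G} = \mathcal{M}$. According to Condition
2) we have $\bar{\mu}_i(\mathbf{x}) < \epsilon_2$, $\forall i \in \mathcal{G}$, for arbitrarily small $\epsilon_2$. Then

\[ \mathrm{E}\left\{\sum_{i=1}^{m} f(\mathbf{x})\bar{\mu}_i(\mathbf{x}) z_i \,\Big|\, \mathbf{x}\right\} \le mC_4\epsilon_2 f(\mathbf{x}) \tag{A.14} \]

holds.

Using the same proof method as for (A.11), it can be shown that the second part in (A.12) can be
bounded by $\sigma_2 f(\mathbf{x})$ for any $\sigma_2 > 0$.

Define $\theta = \Gamma - \sigma_1 - mC_4\epsilon_2 - \sigma_2$ and choose $\sigma_1, \sigma_2, \epsilon_2$ so that $\theta > 0$. We then have the drift

\[ \Delta V(\mathbf{x}) \le -\theta f(\mathbf{x}), \tag{A.15} \]

which is negative, and the Markov chain is $f$-stable. □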

Lemma A.1 applies to fields which are completely integrable, which is of course too restrictive.
However, it can be shown that if $\bar{\boldsymbol{\mu}}(\mathbf{x})$ has the properties described in Theorem 4.1, it can be
approximated by some (at least piecewise integrable) function $\hat{\boldsymbol{\mu}}(\mathbf{x})$. The following lemma helps us to achieve
our main result.

Lemma A.2. If the function $\bar{\boldsymbol{\mu}}(\mathbf{x})$ fulfills Conditions 1), 2) in Theorem 4.1, then there exists
a positive, unbounded function $f : \mathbb{R}_+^m \to \mathbb{R}_+$ as given in Definition 3.1 and a positive, continuous,
piecewise differentiable function $V : \mathbb{R}_+^m \to \mathbb{R}_+$, such that it holds

\[ \frac{\partial V(\mathbf{x})}{\partial x_i} = f(\mathbf{x})\hat{\mu}_i(\mathbf{x}), \quad \forall i \in \mathcal{M}, \tag{A.16} \]

on each differentiable subdomain of $V$, and

\[ |\hat{\mu}_i(\mathbf{x}) - \bar{\mu}_i(\mathbf{x})| < \epsilon_4, \quad \forall i \in \mathcal{M}, \tag{A.17} \]

for any $\epsilon_4 > 0$ if $\|\mathbf{x}\|$ is sufficiently large.

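Proof. In the following we show how to construct the functions $V(\mathbf{x})$, $f(\mathbf{x})$ and $\hat{\boldsymbol{\mu}}(\mathbf{x})$ based on
$\bar{\boldsymbol{\mu}}(\mathbf{x})$. Since we only need to ensure that $|\hat{\mu}_i(\mathbf{x}) - \bar{\mu}_i(\mathbf{x})| < \epsilon_4$ for large $\|\mathbf{x}\|$, it is sufficient to construct
the functions on the domain where $\|\mathbf{x}\| > B$ for sufficiently large $B$. The functions $V$ and $f$ on the
domain $\|\mathbf{x}\| \le B$ can be defined as any positive, bounded, continuously differentiable function which
is continuous on the boundary $\|\mathbf{x}\| = B$.

In the domain $\|\mathbf{x}\| > B$, we first construct an orthogonal grid such that each cell in the grid
is a rectangle (see Fig. A.1 for an example in $m = 3$ dimensions). Starting from a point $\mathbf{x}^a = \mathbf{X} \in \mathbb{R}_+^m$, the
next cell in the dimensions $i, j$ (see Fig. A.2) has the grid points

\[ \mathbf{x}^a = [X_1, \ldots, X_i, \ldots, X_j, \ldots, X_m]^T, \]
\[ \mathbf{x}^b = [X_1, \ldots, X_i + \Delta X_i, \ldots, X_j, \ldots, X_m]^T, \]
\[ \mathbf{x}^c = [X_1, \ldots, X_i, \ldots, X_j + \Delta X_j, \ldots, X_m]^T, \]
\[ \mathbf{x}^d = [X_1, \ldots, X_i + \Delta X_i, \ldots, X_j + \Delta X_j, \ldots, X_m]^T. \]

The lengths $\Delta X_i$, $\Delta X_j$ of the cell are determined by the equation

\[ \int_0^{\Delta X_i} \left[\bar{\mu}_i(\ldots, X_i + x_i, X_j, \ldots) - \bar{\mu}_i(\ldots, X_i + x_i, X_j + \Delta X_j, \ldots)\right] dx_i
= \int_0^{\Delta X_j} \left[\bar{\mu}_j(\ldots, X_i, X_j + x_j, \ldots) - \bar{\mu}_j(\ldots, X_i + \Delta X_i, X_j + x_j, \ldots)\right] dx_j. \tag{A.18} \]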
Fig. A.1. Orthogonal grid (irregular) in $m = 3$ dimensions. The line integral between two points on the grid line
(e.g. along the two paths marked by dashed lines) depends only on the start point $\mathbf{x}^0$ and end point $\mathbf{x}^*$. It is independent
of the chosen paths.

Condition 2) in Theorem 4.1 implies that in the region $\|\mathbf{x}\| > B$ for some large constant $B$, the
function $\bar{\mu}_i(\mathbf{x})$ decreases with increasing $x_j$ and $\bar{\mu}_j(\mathbf{x})$ decreases with increasing $x_i$ as well. Hence

\[ \bar{\mu}_i(\ldots, X_i, X_j, \ldots) - \bar{\mu}_i(\ldots, X_i, X_j + \Delta X_j, \ldots) > 0, \]
\[ \bar{\mu}_j(\ldots, X_i, X_j, \ldots) - \bar{\mu}_j(\ldots, X_i + \Delta X_i, X_j, \ldots) > 0, \]

and Eqn. (A.18) has positive general solutions with $\Delta X_i, \Delta X_j > 0$. Iteratively taking $\mathbf{x}^b$, $\mathbf{x}^c$ and $\mathbf{x}^d$ as
start points, we can extend the grid until it covers the subdomain in the dimensions $i, j$. Based on the
existing grid lines in the dimensions $i, j$ (e.g. the line $\mathbf{x}^a\mathbf{x}^b$ in Fig. A.1), we can repeat the process in a
further dimension $k$ and construct the grid in this dimension (the grid $\mathbf{x}^a$-$\mathbf{x}^b$-$\mathbf{x}^e$-$\mathbf{x}^f$). Since the relationship
of $\Delta X_i$ and $\Delta X_j$ is determined by the values of $\bar{\mu}_i(\mathbf{x})$ at the particular points, each rectangle in
the grid has a different height and width, so that the constructed grid has an irregular pattern.

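Denote the path starting at $\mathbf{x}^a$ via $\mathbf{x}^b$ to $\mathbf{x}^d$ as $S_{abd}$ and the path starting at $\mathbf{x}^a$ via $\mathbf{x}^c$ to $\mathbf{x}^d$ as
$S_{acd}$. Eqn. (A.18) ensures that the integral of the function $\bar{\boldsymbol{\mu}}(\mathbf{x})$ along the path $S_{abd}$ equals the integral
along the path $S_{acd}$, that is

\[ \begin{aligned}
\int_{S_{abd}} \bar{\boldsymbol{\mu}}(\mathbf{x}) \cdot d\mathbf{s}
&= \int_0^{\Delta X_i} \bar{\mu}_i(\ldots, X_i + x_i, X_j, \ldots)\, dx_i + \int_0^{\Delta X_j} \bar{\mu}_j(\ldots, X_i + \Delta X_i, X_j + x_j, \ldots)\, dx_j \\
&= \int_0^{\Delta X_j} \bar{\mu}_j(\ldots, X_i, X_j + x_j, \ldots)\, dx_j + \int_0^{\Delta X_i} \bar{\mu}_i(\ldots, X_i + x_i, X_j + \Delta X_j, \ldots)\, dx_i \\
&= \int_{S_{acd}} \bar{\boldsymbol{\mu}}(\mathbf{x}) \cdot d\mathbf{s}.
\end{aligned} \tag{A.19} \]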



Since Eqn. (A.19) holds for all cells of the grid, the integral between two arbitrary grid points has the same value along any path on the grid lines. Hence the vector field $\tilde{\boldsymbol{\mu}}(\mathbf{x})$ can be considered "path-independent" along the grid lines. We then define a function $f(\mathbf{x})$ whose value on the grid lines is the integral of $\tilde{\boldsymbol{\mu}}(\mathbf{x})$ along the grid lines, i.e.

$$f(\mathbf{x}^*) := f(\mathbf{X}^0) + \int_S \tilde{\boldsymbol{\mu}}(\mathbf{x}) \cdot d\mathbf{s},$$

where $\mathbf{x}^*$ is a point on a grid line and $S$ is an arbitrary path along the grid lines between the initial point $\mathbf{X}^0$ and $\mathbf{x}^*$.

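Define a new vector field by $\boldsymbol{\nu}(\mathbf{x}) := f(\mathbf{x})\tilde{\boldsymbol{\mu}}(\mathbf{x})$. The line integral of $\boldsymbol{\nu}(\mathbf{x})$ along the path $S_{abd}$ is

$$\begin{aligned}
\int_{S_{abd}} \boldsymbol{\nu}(\mathbf{x}) \cdot d\mathbf{s}
&= \int_{S_{abd}} f(\mathbf{x})\,\tilde{\boldsymbol{\mu}}(\mathbf{x}) \cdot d\mathbf{s}
= \int_{S_{abd}} f(\mathbf{x})\,df(\mathbf{x}) \\
&= \tfrac{1}{2}\big(f^2(\mathbf{x}^d) - f^2(\mathbf{x}^a)\big)
= \int_{S_{acd}} \boldsymbol{\nu}(\mathbf{x}) \cdot d\mathbf{s}.
\end{aligned}$$

Thus the integral of the vector field $\boldsymbol{\nu}(\mathbf{x})$ between two grid points along the grid lines is also independent of the chosen path. We then define a scalar field $V(\mathbf{x})$ whose value on the grid lines is given by

$$V(\mathbf{x}^*) := V(\mathbf{X}^0) + \int_S f(\mathbf{x})\,\tilde{\boldsymbol{\mu}}(\mathbf{x}) \cdot d\mathbf{s}.$$

The values of $f(\mathbf{X}^0)$ and $V(\mathbf{X}^0)$ at the initial point $\mathbf{X}^0$ can be chosen as arbitrary positive constants. Since $\tilde{\mu}_i(\mathbf{x}) > 0$, $\forall i \in \mathcal{M}$, we have $f(\mathbf{x}^*) \to +\infty$ and $V(\mathbf{x}^*) \to +\infty$ as $\|\mathbf{x}^*\| \to +\infty$.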
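The identity $\int f\,df = \tfrac{1}{2}(f^2(\mathbf{x}^d) - f^2(\mathbf{x}^a))$ used above can be checked numerically. The following is a minimal sketch under a simplifying assumption that is not part of the construction in the text: the field `mu` is taken to be the exact gradient of a hypothetical function `f`, so that path independence holds on every rectangle rather than only on a specially constructed grid. The functions `f`, `mu`, `leg_integral` and `path_integral` are illustrative names.

```python
import math

def f(x1, x2):
    # Hypothetical magnitude function (assumption for illustration only).
    return x1 + x2 + math.log(1.0 + x1 + x2)

def mu(x1, x2):
    # Gradient of f: positive, and each component decreases in the other coordinate.
    g = 1.0 + 1.0 / (1.0 + x1 + x2)
    return (g, g)

def leg_integral(p, q, n=2000):
    """Midpoint-rule approximation of the line integral of f*mu along the segment p -> q."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        x1 = p[0] + t * (q[0] - p[0])
        x2 = p[1] + t * (q[1] - p[1])
        m = mu(x1, x2)
        total += f(x1, x2) * (m[0] * (q[0] - p[0]) + m[1] * (q[1] - p[1])) / n
    return total

def path_integral(points):
    """Sum of the leg integrals along a piecewise-linear path."""
    return sum(leg_integral(points[k], points[k + 1]) for k in range(len(points) - 1))

# Corners of one rectangular cell: x^a, x^b, x^c, x^d.
xa, xb, xc, xd = (1.0, 2.0), (3.0, 2.0), (1.0, 5.0), (3.0, 5.0)
via_b = path_integral([xa, xb, xd])   # path S_abd
via_c = path_integral([xa, xc, xd])   # path S_acd
closed_form = 0.5 * (f(*xd) ** 2 - f(*xa) ** 2)
```

Both path integrals agree with the closed form up to quadrature error. In the construction of the text the field is not assumed to be a gradient; the cell sizes are chosen so that the same equality holds along the grid lines.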

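Once the value of $V(\mathbf{x}^*)$ is fixed on the grid lines, we obtain the value of $V$ inside a grid cell by linear interpolation of $V(\mathbf{x}^*)$ along the lines parallel to the diagonal (see Fig. A.2), i.e. in the lower triangle with $\frac{\Delta x_i}{\Delta X_i} + \frac{\Delta x_j}{\Delta X_j} \le 1$, $V$ is defined as

$$V(\ldots, X_i + \Delta x_i, \ldots, X_j + \Delta x_j, \ldots) = \kappa_i V(\mathbf{x}^I) + \kappa_j V(\mathbf{x}^J), \tag{A.20}$$

where

$$\begin{aligned}
\kappa_i &= \frac{\Delta X_j \Delta x_i}{\Delta X_j \Delta x_i + \Delta X_i \Delta x_j}, \qquad
\kappa_j = \frac{\Delta X_i \Delta x_j}{\Delta X_j \Delta x_i + \Delta X_i \Delta x_j}, \\
\mathbf{x}^I &= \Big[X_1, \ldots, X_i + \Delta x_i + \frac{\Delta X_i}{\Delta X_j}\Delta x_j, \ldots, X_j, \ldots, X_m\Big]^T, \\
\mathbf{x}^J &= \Big[X_1, \ldots, X_i, \ldots, X_j + \Delta x_j + \frac{\Delta X_j}{\Delta X_i}\Delta x_i, \ldots, X_m\Big]^T,
\end{aligned}$$

and in the upper triangle with $\frac{\Delta x_i}{\Delta X_i} + \frac{\Delta x_j}{\Delta X_j} > 1$, $V$ is defined as

$$V(\ldots, X_i + \Delta x_i, \ldots, X_j + \Delta x_j, \ldots) = \kappa_i V(\mathbf{x}^I) + \kappa_j V(\mathbf{x}^J), \tag{A.21}$$

where

$$\begin{aligned}
\kappa_i &= \frac{\Delta X_j \Delta X_i - \Delta X_j \Delta x_i}{2\Delta X_i \Delta X_j - \Delta X_j \Delta x_i - \Delta X_i \Delta x_j}, \qquad
\kappa_j = \frac{\Delta X_j \Delta X_i - \Delta X_i \Delta x_j}{2\Delta X_i \Delta X_j - \Delta X_j \Delta x_i - \Delta X_i \Delta x_j}, \\
\mathbf{x}^I &= \Big[\ldots, X_i + \Delta x_i + \frac{\Delta X_i}{\Delta X_j}\Delta x_j - \Delta X_i, \ldots, X_j + \Delta X_j, \ldots\Big]^T, \\
\mathbf{x}^J &= \Big[\ldots, X_i + \Delta X_i, \ldots, X_j + \Delta x_j + \frac{\Delta X_j}{\Delta X_i}\Delta x_i - \Delta X_j, \ldots\Big]^T.
\end{aligned}$$

Eqns. (A.20) and (A.21) determine the value of $V(\mathbf{x})$ on the orthogonal planes spanned by the grid. The value of $V(\mathbf{x})$ in the space between these planes is then calculated by linear interpolation of the existing values in further dimensions. Similarly, we can also define the value of $f(\mathbf{x})$ in the entire domain.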
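The interpolation rule of Eqns. (A.20) and (A.21) can be sketched for a single two-dimensional cell as follows. This is an illustrative sketch, not the authors' implementation: the function name `interpolate_in_cell`, the callable `V_edge` (which returns the already fixed values on the cell boundary) and the tuple-based interface are assumptions introduced here.

```python
def interpolate_in_cell(V_edge, X, dX, dx):
    """Diagonal-parallel linear interpolation inside one 2-D grid cell.

    V_edge(p): value of V at a boundary point p = (p_i, p_j) of the cell.
    X  = (X_i, X_j):   lower-left corner x^a of the cell.
    dX = (dX_i, dX_j): side lengths of the cell (both positive).
    dx = (dx_i, dx_j): offset of the query point from x^a, 0 <= dx <= dX.
    """
    Xi, Xj = X
    dXi, dXj = dX
    dxi, dxj = dx
    if dxi == 0 and dxj == 0:          # query point is the corner x^a
        return V_edge((Xi, Xj))
    if dxi == dXi and dxj == dXj:      # query point is the corner x^d
        return V_edge((Xi + dXi, Xj + dXj))
    if dxi / dXi + dxj / dXj <= 1:     # lower triangle, Eqn. (A.20)
        denom = dXj * dxi + dXi * dxj
        k_i = dXj * dxi / denom
        k_j = dXi * dxj / denom
        x_I = (Xi + dxi + dXi / dXj * dxj, Xj)            # on the bottom edge
        x_J = (Xi, Xj + dxj + dXj / dXi * dxi)            # on the left edge
    else:                              # upper triangle, Eqn. (A.21)
        denom = 2 * dXi * dXj - dXj * dxi - dXi * dxj
        k_i = (dXi * dXj - dXj * dxi) / denom
        k_j = (dXi * dXj - dXi * dxj) / denom
        x_I = (Xi + dxi + dXi / dXj * dxj - dXi, Xj + dXj)  # on the top edge
        x_J = (Xi + dXi, Xj + dxj + dXj / dXi * dxi - dXj)  # on the right edge
    return k_i * V_edge(x_I) + k_j * V_edge(x_J)


# Usage: for an affine V the interpolation should reproduce V exactly.
V_lin = lambda p: 2.0 * p[0] + 3.0 * p[1] + 1.0
lower = interpolate_in_cell(V_lin, (1.0, 2.0), (2.0, 4.0), (0.5, 1.0))
upper = interpolate_in_cell(V_lin, (1.0, 2.0), (2.0, 4.0), (1.5, 3.5))
```

In both triangles the weights $\kappa_i, \kappa_j$ are non-negative and sum to one, and the points $\mathbf{x}^I$, $\mathbf{x}^J$ lie on a line parallel to the diagonal $\mathbf{x}^b\mathbf{x}^c$, which is why an affine function is reproduced exactly.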

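Observing the function $V(\mathbf{x})$, we can see that it is continuous in $\mathbb{R}^m$ and differentiable in each subspace bounded by the grid lines and diagonal lines. For two points $\mathbf{x}$ and $\mathbf{x}'$ which lie in the same cell, under Condition 1) in Theorem 4.1 we have $|\tilde{\mu}_i(\mathbf{x}) - \tilde{\mu}_i(\mathbf{x}')| < \epsilon_1$ and hence $|f(\mathbf{x}) - f(\mathbf{x}')| < \epsilon_1 f(\mathbf{x})$ for arbitrarily small $\epsilon_1 > 0$. Then for Eqn. (A.20) it holds that

$$\begin{aligned}
V(\mathbf{x}^I) &= V(\mathbf{x}^a) + f(\mathbf{x})\big(\tilde{\mu}_i(\mathbf{x}) + \epsilon_i(\mathbf{x})\big)\Big(\Delta x_i + \frac{\Delta X_i}{\Delta X_j}\Delta x_j\Big), \\
V(\mathbf{x}^J) &= V(\mathbf{x}^a) + f(\mathbf{x})\big(\tilde{\mu}_j(\mathbf{x}) + \epsilon_j(\mathbf{x})\big)\Big(\Delta x_j + \frac{\Delta X_j}{\Delta X_i}\Delta x_i\Big),
\end{aligned}$$

and further

$$V(\mathbf{x}) = V(\mathbf{x}^a) + f(\mathbf{x})\big(\tilde{\mu}_i(\mathbf{x}) + \epsilon_i(\mathbf{x})\big)\Delta x_i + f(\mathbf{x})\big(\tilde{\mu}_j(\mathbf{x}) + \epsilon_j(\mathbf{x})\big)\Delta x_j,$$

where the deviations $\epsilon_i(\mathbf{x}), \epsilon_j(\mathbf{x}) \to 0$ as $\|\mathbf{x}\| \to +\infty$. Similarly, we obtain the same result for Eqn. (A.21).

Then the partial derivative of $V$ is

$$\frac{\partial V(\mathbf{x})}{\partial x_i} = f(\mathbf{x})\big(\tilde{\mu}_i(\mathbf{x}) + \epsilon_4\big)$$

for arbitrarily small $\epsilon_4 > 0$, and we obtain Lemma A.2. □

Fig. A.2. The value of $V(\mathbf{x}) := V(\ldots, X_i + \Delta x_i, \ldots, X_j + \Delta x_j, \ldots)$ is calculated by linear interpolation between the values $V(\mathbf{x}^I)$ and $V(\mathbf{x}^J)$ defined on the grid lines. The line $\mathbf{x}^I\mathbf{x}^J$ is parallel to the diagonal $\mathbf{x}^b\mathbf{x}^c$.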



It can be shown that $f(\mathbf{x})$ and $V(\mathbf{x})$ constructed in Lemma A.2 are positive and grow to infinity as $\|\mathbf{x}\| \to +\infty$. We now use the functions $V(\mathbf{x})$ and $f(\mathbf{x})$ from Lemma A.2 as the Lyapunov function and the stability measure, respectively. It can also be shown that $\Delta V(\mathbf{x})$ is bounded if $\mathbf{x}$ lies in some compact region $\mathcal{B}$ and the arrival rates $a_i$ and transmission rates $r_i$ are bounded. Hence the Lyapunov condition (A.1) is satisfied.

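Next we consider the drift $\Delta V(\mathbf{x})$ in Lyapunov condition (A.2), where $\mathbf{x} \in \mathcal{B}^c$. The line segment connecting $\mathbf{x}$ and $\mathbf{x} + \Delta\mathbf{x}$ may pass through multiple differentiable subspaces of $V(\mathbf{x})$ (see Fig. A.4), so we denote the intersections of the connecting line with the boundaries of the subspaces by $\mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(L)}$ and the differences by $\Delta\mathbf{x}^{(1)} = \mathbf{x}^{(1)} - \mathbf{x}, \ldots, \Delta\mathbf{x}^{(l+1)} = \mathbf{x}^{(l+1)} - \mathbf{x}^{(l)}, \ldots, \Delta\mathbf{x}^{(L+1)} = \mathbf{x} + \Delta\mathbf{x} - \mathbf{x}^{(L)}$. The drift is written as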



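$$\begin{aligned}
\Delta V(\mathbf{x}) &= E\Big\{ V(\mathbf{x} + \Delta\mathbf{x}) - V(\mathbf{x}^{(L)}) + \cdots + V(\mathbf{x}^{(l+1)}) - V(\mathbf{x}^{(l)}) + \cdots + V(\mathbf{x}^{(1)}) - V(\mathbf{x}) \,\Big|\, \mathbf{x} \Big\} \\
&\le E\Big\{ \sum_{l=1}^{L+1} f(\tilde{\mathbf{x}}^{(l)})\,\tilde{\boldsymbol{\mu}}(\tilde{\mathbf{x}}^{(l)}) \cdot \Delta\mathbf{x}^{(l)} \,\Big|\, \mathbf{x} \Big\} + \epsilon_4 \sum_{l=1}^{L+1} f(\tilde{\mathbf{x}}^{(l)}),
\end{aligned}$$
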



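where $\tilde{\mathbf{x}}^{(l)}$ is some point in the $l$-th subspace. Since the arrival rates $a_i$ and the transmission rates $r_i$ are bounded for all $i \in \mathcal{M}$, the difference $\|\Delta\mathbf{x}\|$ is bounded. Thus, according to Condition 1) in Theorem 4.1, we have $|\tilde{\mu}_i(\tilde{\mathbf{x}}^{(l)}) - \tilde{\mu}_i(\tilde{\mathbf{x}}^{(1)})| < \epsilon_1$ and $|f(\tilde{\mathbf{x}}^{(l)}) - f(\tilde{\mathbf{x}}^{(1)})| < \epsilon_1 f(\tilde{\mathbf{x}}^{(1)})$ for arbitrary $\epsilon_1 > 0$ if $\|\tilde{\mathbf{x}}^{(l)}\|$ is large. The drift then satisfies

$$\begin{aligned}
\Delta V(\mathbf{x}) &\le E\Big\{ f(\tilde{\mathbf{x}}^{(1)})\,\tilde{\boldsymbol{\mu}}(\tilde{\mathbf{x}}^{(1)}) \cdot \sum_{l=1}^{L+1} \Delta\mathbf{x}^{(l)} \,\Big|\, \mathbf{x} \Big\} + \sigma_3 f(\mathbf{x}) \\
&\le E\Big\{ f(\tilde{\mathbf{x}}^{(1)})\,\tilde{\boldsymbol{\mu}}(\tilde{\mathbf{x}}^{(1)}) \cdot (\mathbf{x} + \Delta\mathbf{x} - \mathbf{x}) \,\Big|\, \mathbf{x} \Big\} + \sigma_3 f(\mathbf{x}),
\end{aligned}$$

where $\sigma_3$ is some small constant.

Fig. A.3. The Lyapunov function $V(\mathbf{x})$ is differentiable inside the subdomain between $\mathbf{x}^a, \mathbf{x}^b, \mathbf{x}^c$ and the subdomain between $\mathbf{x}^b, \mathbf{x}^c, \mathbf{x}^d$.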

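Using the previous result in (A.15), it holds that

$$\begin{aligned}
\Delta V(\mathbf{x}) &\le E\Big\{ \sum_{i=1}^{m} f(\mathbf{x})\,\tilde{\mu}_i(\mathbf{x})\,(a_i - r_i + z_i) \,\Big|\, \mathbf{x} \Big\} + \sigma_3 f(\mathbf{x}) \\
&\le -\theta f(\mathbf{x}) + \sigma_3 f(\mathbf{x}) \\
&\le -\theta' f(\mathbf{x})
\end{aligned}$$

for some $\theta' > 0$ if $\|\mathbf{x}\| > B$, for some $B > 0$. The drift is negative, and thus the Markov chain is positive recurrent.

Fig. A.4. The drift $\Delta V$ crosses 5 subdomains and can be written as the sum of the differences between $V(\mathbf{x})$, $V(\mathbf{x}^{(l)})$, and $V(\mathbf{x} + \Delta\mathbf{x})$.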

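Finally, we prove that the chain is also $f$-stable for the magnitude function $f(\mathbf{x})$. We can write

$$\begin{aligned}
E\{V(\mathbf{x} + \Delta\mathbf{x})\}
&= E\{V(\mathbf{x} + \Delta\mathbf{x}) \mid \|\mathbf{x}\| > B\}\Pr(\|\mathbf{x}\| > B) + E\{V(\mathbf{x} + \Delta\mathbf{x}) \mid \|\mathbf{x}\| \le B\}\Pr(\|\mathbf{x}\| \le B) \\
&\le E\{V(\mathbf{x}) - \theta' f(\mathbf{x}) \mid \|\mathbf{x}\| > B\}\Pr(\|\mathbf{x}\| > B) + E\{V(\mathbf{x} + \Delta\mathbf{x}) \mid \|\mathbf{x}\| \le B\}\Pr(\|\mathbf{x}\| \le B) \\
&\le E\{V(\mathbf{x})\} - \theta' E\{f(\mathbf{x})\} + C_5,
\end{aligned}$$

where $C_5$ is some constant satisfying

$$C_5 \ge E\{V(\mathbf{x} + \Delta\mathbf{x}) \mid \|\mathbf{x}\| \le B\}\Pr(\|\mathbf{x}\| \le B) + E\{\theta' f(\mathbf{x}) \mid \|\mathbf{x}\| \le B\}\Pr(\|\mathbf{x}\| \le B).$$

Using the telescoping argument, the summation of the drift over $T$ time slots yields

$$E\{V(\mathbf{x}^{T+1})\} \le E\{V(\mathbf{x}^1)\} - \theta' \sum_{n=1}^{T} E\{f(\mathbf{x}^n)\} + T \cdot C_5,$$

and since $V(\mathbf{x})$ is a non-negative function, it holds that

$$\theta' \sum_{n=1}^{T} E\{f(\mathbf{x}^n)\} \le E\{V(\mathbf{x}^1)\} + T \cdot C_5.$$

Hence we have

$$\limsup_{T \to \infty} \frac{1}{T} \sum_{n=1}^{T} E\{f(\mathbf{x}^n)\} \le \limsup_{T \to \infty} \left( \frac{E\{V(\mathbf{x}^1)\}}{\theta' T} + \frac{C_5}{\theta'} \right) = \frac{C_5}{\theta'} < +\infty,$$

which completes the proof.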
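The telescoping bound can be illustrated with a toy recursion. The sketch below is not a queueing simulation: it assumes, purely for illustration, that the drift inequality holds with equality and that the stability measure coincides with the Lyapunov function ($f = V$), so that the expected values follow a deterministic recursion. The names `time_average_bound` and `simulate_time_average` and all numerical values are hypothetical.

```python
def time_average_bound(V1, theta, C5, T):
    """Bound on (1/T) * sum_n E{f(x^n)} obtained from the telescoped drift inequality."""
    return V1 / (theta * T) + C5 / theta

def simulate_time_average(V1, theta, C5, T):
    """Time average of f for the toy recursion V_{n+1} = V_n - theta * f_n + C5.

    Assumption for illustration: f_n = V_n, which keeps V non-negative
    whenever 0 < theta < 1, C5 >= 0 and V1 >= 0.
    """
    V, total_f = V1, 0.0
    for _ in range(T):
        f_n = V
        total_f += f_n
        V = V - theta * f_n + C5
    return total_f / T

# Hypothetical parameter values.
V1, theta, C5 = 50.0, 0.2, 1.0
averages = {T: simulate_time_average(V1, theta, C5, T) for T in (1, 10, 1000)}
bounds = {T: time_average_bound(V1, theta, C5, T) for T in (1, 10, 1000)}
```

For every horizon the simulated time average stays below the bound, and the bound itself approaches $C_5/\theta'$ as $T$ grows, which is the finite limit used in the last step of the proof.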

Appendix B. Proof of Corollary 4.2

By Condition 2) of Corollary 4.2 we can assume that the random walk evolves on $\mathbb{R}^m$. Hence, we can skip Condition 2) of Theorem 4.1, since this condition (like its counterpart in Corollary 4.2) ensures positivity of the random walk. We need to show that from

$$\|\nabla \log \mu_i(x)\| \le \epsilon, \quad \forall i \in \mathcal{M}, \quad \|x\| > C_6(\epsilon), \tag{B.1}$$

(where $C_6(\epsilon)$ is sufficiently large) it follows that

$$\left| \frac{\mu_i(x + \Delta x)}{\sum_{j \in \mathcal{M}} \mu_j(x + \Delta x)} - \frac{\mu_i(x)}{\sum_{j \in \mathcal{M}} \mu_j(x)} \right| \le \epsilon. \tag{B.2}$$

For orientation, let us first assume more restrictive conditions: take $\mu_i$, $\forall i \in \mathcal{M}$, Lipschitz continuous and let $\sum_{j \in \mathcal{M}} \mu_j(x) \to \infty$ if $\|x\| \to \infty$. Note that these conditions already encompass Meyn's perturbation (3.7) together with e.g. a linear cost function.

It is easy to prove the corollary under these assumptions: by the mean value theorem we have

$$\mu_j(x + \Delta x) = \mu_j(x) + \nabla^T \mu_j(\tilde{x})\,\Delta x,$$

where $\tilde{x}$ is a point on the line segment connecting $x$ and $x + \Delta x$. Since the field is Lipschitz, we have $\|\nabla \mu_i(\tilde{x})\| \le C_7$ uniformly. Furthermore, since the policy is non-idling, $\sum_{j \in \mathcal{M}} \mu_j(x + \Delta x) \ge C_8$, where the normalization constant $C_8$ can be chosen as large as desired without altering the policy (by the construction of the policy). Moreover, since $\sum_{j \in \mathcal{M}} \mu_j(x) \to \infty$ as $\|x\| \to \infty$, condition (B.2) is equivalent to

$$|\mu_i(x + \Delta x) - \mu_i(x)| < \epsilon \sum_{j \in \mathcal{M}} \mu_j(x)$$

and, again by the mean value theorem, to

$$|\nabla^T \mu_i(\tilde{x})\,\Delta x| < \epsilon \sum_{j \in \mathcal{M}} \mu_j(x).$$

Here, we tacitly assumed that $\tilde{x}$ has been selected accordingly. Since $\Delta x$ is fixed, and by the positivity of $\mu_j$, it is sufficient that

$$\|\nabla \mu_i(\tilde{x})\| < \frac{\epsilon}{\|\Delta x\|}\,\mu_i(x),$$

which is equivalent to condition (B.1) with some $\|x\| > C_6(\epsilon')$ ($\epsilon'$ slightly smaller).
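The restrictive case can be made concrete with a small numerical experiment. The sketch below uses a hypothetical weight field $\mu_i(x) = x_i + 1$, chosen only because it is Lipschitz continuous and its sum diverges as $\|x\| \to \infty$; it is not a field taken from the paper. The helper names `mu`, `normalized` and `max_deviation` are likewise assumptions. The code measures the left-hand side of (B.2) for a fixed jump $\Delta x$ at increasing distances from the origin.

```python
def mu(x):
    """Hypothetical Lipschitz weight field mu_i(x) = x_i + 1 (assumption for illustration)."""
    return [xi + 1.0 for xi in x]

def normalized(x):
    """Normalized field mu_i(x) / sum_j mu_j(x)."""
    m = mu(x)
    s = sum(m)
    return [mi / s for mi in m]

def max_deviation(x, dx):
    """Largest componentwise change of the normalized field between x and x + dx, cf. (B.2)."""
    a = normalized(x)
    b = normalized([xi + di for xi, di in zip(x, dx)])
    return max(abs(ai - bi) for ai, bi in zip(a, b))

# Fixed bounded jump, evaluated at states that move away from the origin.
dx = [1.0, -1.0, 1.0]
radii = [10.0, 100.0, 1000.0]
deviations = [max_deviation([r, 1.0, 1.0], dx) for r in radii]
```

For this toy field the deviation decays roughly like $1/\|x\|$, so for any $\epsilon > 0$ the bound (B.2) holds once $\|x\|$ is large enough, in line with the argument above.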
Let us now prove the general case. Condition (B.1) can be written as

$$\frac{1}{\sum_{i \in \mathcal{M}} \mu_i(\tilde{x})}\,\nabla^T \mu_j(\tilde{x})\,\Delta x = \epsilon_n,$$

for some $\tilde{x}$ with $\|\tilde{x}\| > C(\epsilon_n)$, where $\epsilon_n$ is a zero sequence and $C(\epsilon_n)$ is strictly increasing, for any fixed $\Delta x \in \mathbb{R}^m$. Now, again, by the mean value theorem



Hi(x + Ax) 



m{x) 



< e, 



(B.3) 



where we set a; as before and let x + Ax = x and x + Ax = x + Ax. x, x are points on the line 
connecting x and x respectively x and x + Ax. Note that Hj(x) is zero if and only if fii(x + Ax) and 
$\mu_i(x)$ are both zero, since otherwise by condition (B.1) the gradient would be zero as well. In this case the condition is trivially satisfied, so we exclude it.
Hence from (B.3) it follows



(A) 



m(x + Ax) - m(x) 



V r /J 3 (i)Ai, 



V J (j,j(x)Ax 

H 3 {x) 
« ' 

(B) 



j£M 



V T Hj(x)Ax 



We can prove that, because of condition (B.1), (A) and (B) are zero sequences: suppose $\nabla^T \mu_j(\tilde{x})$ is non-zero (otherwise we can stop anyway); then, by repeated application of the mean value theorem, the denominator of, say, (A) can be written as:

$$\mu_j(\tilde{x}) = \mu_j(x) + \nabla^T \mu_j(\tilde{x}_2)\,\Delta x_2.$$

This process generates sequences in $\mathbb{R}^m$ with $\tilde{x} = \tilde{x}_1, \tilde{x}_2, \ldots$ and $\Delta x = \Delta x_1 \subset \Delta x_2, \ldots$ which are bounded, and hence we can pick subsequences converging to some set of limit points $x_\infty^{(k)}$, $k = 1, 2, \ldots$. Note that we can restrict the number of limit points to at most two, since by definition every limit point is visited arbitrarily often and arbitrarily closely, and by construction of the sequence there cannot be more than two limit points none of which lies between the others. Take these two limit points with corresponding subsequences $\tilde{x}_n^{(k)}$, $k = 1, 2$: by continuous differentiability we have

$$\mu_j(\tilde{x}_n^{(k)}) \to \mu_j(x_\infty^{(k)}) \quad \text{and} \quad \nabla \mu_j(\tilde{x}_n^{(k)}) \to \nabla \mu_j(x_\infty^{(k)}), \quad k = 1, 2.$$

It must also hold in the limit:



m = 



(and vice versa). Since then 



f-tj («^OC )/^J*(»Z'00 ) 



<e, 



(and vice versa), where $\epsilon > 0$ is arbitrarily small by condition (B.1), we conclude that $\mu_j(x_\infty^{(1)}) = \mu_j(x_\infty^{(2)})$ (but not necessarily $x_\infty^{(1)} = x_\infty^{(2)}$).


Now we can repeat the process sufficiently often, as

\7 T fij(xi)Axi 



flj(x) 



< 



V T fij(x 1 )Ax 1 



/ij(iCl) 1 + 



< 
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such that in the final step 



V T /ij(a; n+ i)Aa;„ + i _ (V T fj,j(x^) + e^)Ax n 



+1 



= {V T ^{xt ] ) + ^)Ax n+1 

(Xjixoo) + e l n 
<e, fc,/ = l,2, 



by condition (B.1). Hence, we have

E^^^^ + ^SF) (i + 4) 



1 + e"', e' , e" zero sequences 



and further 



\m(x + Ax) - Mi(x)(l + e"')l < + Aa; ) - MiWI + MiO*OC 

<e E Mj(*)(1 + 0» 

which is equivalent to: 

\(M(x + Ax) - fii(x)\ < e ^ + 4) - e^iN- 

Since $x$ is arbitrary and can be suitably chosen, condition (B.1) with some $\|x\| > C_6(\epsilon'''')$ is sufficient for the latter to hold.

Appendix C. Proof of Corollary 4.3

We can write 

where we defined I := Note, that here afj only depends on iEj. The gradient of the weight fJ-i(x) is 
given by: 

dtM^_i£-M)^(A) + l(x,)£-^(x) i = j 

d dh, 



-(*) 



Define $x^\Delta := x + \Delta x$ and $\tilde{x}^\Delta := \tilde{x}(x^\Delta)$. From the proof of Corollary 4.2 it is clear that we only have to show that



\V T [ii(x)Ax\ 



< e, 



M* A )II 

for some arbitrarily small $\epsilon > 0$. This can be rewritten as:

^{ Xl )^{x)A Xl + l{ Xl )^-^{x)Ax t l{x l )^ M ,^£-^( A )^ x l <t 
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Since §§a,Z are Lipschitz, thus af-fj 1 , ^- are uniformly bounded, and l(xi), %*%{x) > l 1+e (x,) ->■ oo 

when irj — > oo, the effect of Aa; vanishes in the denominator. The condition ^ L (x) > l 1+e (xi) is 
required since we have expressions of the form 

l{Xj)l(Xj) 

l {x ^(x)+l(x 3 )^(x) 

which then become arbitrarily small. 
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